class: center, middle, inverse, title-slide

# Regression Discontinuity Design
## Tutorial 6
### Stanislav Avdeev

---

# Regression discontinuity

- Regression discontinuity design (RDD) is currently the darling of the econometric world for estimating causal effects without running an experiment

The basic idea is this:

- We look for a treatment that is assigned on the basis of being above/below a *cutoff value* of a continuous variable, for example:
  - if a candidate gets 50.1% of the vote they're in, 49.9% and they're out
  - if you're 65 years old you get Medicare, if you're 64.99 years old you don't
  - if you score 75 or above, you'll be admitted into a "gifted and talented" (GATE) program

We call these continuous variables "running variables" because we *run along them* until we hit the cutoff

---

# Regression discontinuity

- First, let's simulate the dataset

```r
library(tidyverse) # tibble(), mutate() and the %>% pipe

set.seed(1000)
rdd.data <- tibble(test = runif(1000)*100) %>%
  mutate(GATE = test >= 75) %>%
  mutate(earn = runif(1000)*40 + 10*GATE + test/2)
```

---

# Regression discontinuity

- Notice that the y-axis here is *In GATE*, not the outcome

<!-- -->

---

# Regression discontinuity

- Here's how it looks when we consider the actual outcome

<!-- -->

---

# Regression discontinuity

- If we look at the relationship between treatment and going to college, we'll be picking up the fact that higher test scores make you more likely to go to college anyway

<!-- -->

---

# Regression discontinuity

- Except, that's not actually what the diagram looks like.
Test only affects GATE to the extent that it puts you above the 75 cutoff

<!-- -->

---

# Regression discontinuity

- Basically, the idea is that *right around the cutoff*, treatment is randomly assigned
- If you have a test score of 74.9 (not high enough for gifted-and-talented), you're basically the same as someone who has a test score of 75.0 (just barely high enough)
- So we have two groups, the just-barely-missed-outs and the just-barely-made-its, that are basically exactly the same except that one happened to get treatment
- A perfect description of what we're looking for in a control group
- So if we just focus around the cutoff, it's basically random which side of the line you're on
- But we get variation in treatment
- This specifically gives us the effect of treatment *for people who are right around the cutoff*: a "local average treatment effect" (we still won't know the effect of being put in gifted-and-talented for someone who gets a 30)

---

# Regression discontinuity

- A very basic idea of this, before we even get to regression, is to create a *binned chart*
- And see how the bin values jump at the cutoff
- A binned chart chops the X-axis up into bins
- Then takes the average Y value within each bin
- Then, we look at how those X bins relate to the Y binned values
- If it looks like a pretty normal, continuous relationship, then JUMPS UP at the cutoff X-axis value, that tells us that the treatment itself must be doing something

---

# Simulation

The true effect is `\(10\)`

```r
#Choose a "bandwidth" of how wide around the cutoff to look (arbitrary in our example)
#Bandwidth of 2 with a cutoff of 75 means we look from 75-2 to 75+2
bandwidth <- 2

#Just look within the bandwidth
rdd <- rdd.data %>%
  filter(abs(75-test) < bandwidth) %>%
  #Create a variable indicating we're above the cutoff
  mutate(above = test >= 75) %>%
  #And compare our outcome just below the cutoff to just above
  group_by(above) %>%
  summarize(earn = mean(earn))
rdd
```

```
## # A tibble: 2 × 2
##   above  earn
##   <lgl> <dbl>
## 1 FALSE  55.2
## 2 TRUE   66.0
```

```r
#Our effect looks just about right
rdd$earn[2] - rdd$earn[1]
```

```
## [1] 10.80055
```

---

# Graphically

<!-- -->

---

# Regression discontinuity in regression

- First, we need to *transform our data*
- We need a "Treated" variable that's `TRUE` when treatment is applied (above or below the cutoff)
- Then, we are going to want a bunch of things to change at the cutoff. This will be easier if the running variable is *centered around the cutoff*. So we'll turn our running variable `\(X\)` into `\(X - cutoff\)` and call that `\(XCentered\)`

Let's start with the simple linear version:

$$ Y = \beta_0 + \beta_1XCentered + \beta_2Treated + \beta_3Treated\times XCentered +\varepsilon $$

- `\(\beta_2\)` is how the intercept jumps - that's the RDD effect
- `\(\beta_3\)` is how the slope changes - that's the RKD effect
- Sometimes the effect of interest is the interaction term - the change in slope. This answers the question "does the effect of `\(X\)` on `\(Y\)` change at the cutoff?" This is called a "regression kink" design

---

# Regression discontinuity in regression

The true effect is `\(0.7\)`

```r
set.seed(2000)
df <- tibble(X = runif(1000)) %>%
  mutate(treated = X > .5) %>%
  mutate(X_centered = X - .5) %>%
  mutate(Y = X_centered + .7*treated + .5*X_centered*treated + rnorm(1000,0,.3))

lm(Y ~ treated*X_centered, data = df)
```

```
## 
## Call:
## lm(formula = Y ~ treated * X_centered, data = df)
## 
## Coefficients:
##            (Intercept)             treatedTRUE              X_centered  
##               -0.01113                 0.74669                 0.98250  
## treatedTRUE:X_centered  
##                0.44696
```

You can take this basic interaction-with-cutoff design idea and use it to look at how *anything* changes before and after cutoff, not just the level of `\(Y\)`.
You could look at how the *slope* changes ("regression kink")

---

# Graphically

- The true model is an RDD effect of `\(0.7\)`, with a slope of `\(1\)` to the left of the cutoff and a slope of `\(1.5\)` to the right

<!-- -->

---

# Choices

- Bandwidth choice
- Functional form
- Controls

---

# Bandwidth choice

- The idea of RDD is that people *just around the cutoff* are very much comparable
- Basically random if your test score is 74 vs. 76 if the cutoff is 75, for example
- So people far away from the cutoff aren't too informative. At best they help determine the slope of the fitted lines
- So we might limit our analysis within just a narrow window around the cutoff
- This makes the exogenous-at-the-jump assumption more plausible, and lets us worry less about functional form (over a narrow range, not too much difference between a linear term and a square), but on the flip side reduces our sample size considerably
- Imbens and Gelman (2018) show that the "naive" RDD estimators place high weights on observations far from the threshold
- So it's better to drop these observations

---

# Bandwidth choice

- RDD generally uses data only from the observations in a given range around the cutoff
- Or at least weights them less the further away they are from cutoff
- How wide should the bandwidth be?
- There's a big wide literature on *optimal bandwidth selection* which balances the addition of bias (from adding people far away from the cutoff who may have back doors) vs. variance (from adding more people so as to improve estimator precision)
- We won't be doing this by hand, we can often rely on an RDD command to do this for us
- The `rdrobust` package in R implements some state of the art literature on this subject.
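
The "weights them less the further away they are" idea can be sketched with a triangular kernel in base R. This is only an illustration under stated assumptions: the bandwidth `h = 0.25` is picked arbitrarily rather than by an optimal selector, the names `d`, `h`, `w` and `jump` are made up for this sketch, and the data are re-simulated from the same process as the regression example (true jump `0.7`).

```r
# Illustrative sketch (not the rdrobust algorithm): local linear RDD with
# triangular kernel weights, using base R only.
set.seed(2000)
n <- 1000
X <- runif(n)
d <- data.frame(X_centered = X - 0.5, treated = X > 0.5)
d$Y <- d$X_centered + 0.7 * d$treated + 0.5 * d$X_centered * d$treated +
  rnorm(n, 0, 0.3)

h <- 0.25  # assumed bandwidth, chosen arbitrarily for the example

# Triangular kernel: weight 1 at the cutoff, falling linearly to 0 at distance h
d$w <- pmax(0, 1 - abs(d$X_centered) / h)

# Weighted regression on the observations that receive a positive weight
fit <- lm(Y ~ treated * X_centered, data = d[d$w > 0, ], weights = w)

# The coefficient on the treatment indicator is the estimated jump at the cutoff
jump <- unname(coef(fit)["treatedTRUE"])
jump
```

`rdrobust` also defaults to a triangular kernel, but it adds data-driven bandwidths and bias correction on top, so its numbers will generally differ from this hand-rolled version.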
---

# Bandwidth choice

- Pay attention to the sample sizes, accuracy (true value `\(0.7\)`) and standard errors

```r
library(jtools) # provides export_summs()

m1 <- lm(Y~treated*X_centered, data = df)
m2 <- lm(Y~treated*X_centered, data = df %>% filter(abs(X_centered) < .25))
m3 <- lm(Y~treated*X_centered, data = df %>% filter(abs(X_centered) < .1))
m4 <- lm(Y~treated*X_centered, data = df %>% filter(abs(X_centered) < .05))
m5 <- lm(Y~treated*X_centered, data = df %>% filter(abs(X_centered) < .01))
export_summs(m1,m2,m3,m4,m5, statistics = c(N = 'nobs'), coefs = 'treatedTRUE')
```
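
| | Model 1 | Model 2 | Model 3 | Model 4 | Model 5 |
|:---|:---:|:---:|:---:|:---:|:---:|
| treatedTRUE | 0.75 \*\*\* | 0.77 \*\*\* | 0.71 \*\*\* | 0.61 \*\*\* | 0.56 |
| | (0.04) | (0.06) | (0.09) | (0.15) | (0.43) |
| N | 1000 | 492 | 206 | 93 | 15 |

\*\*\* p < 0.001; \*\* p < 0.01; \* p < 0.05.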
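
The pattern in the table (noisier estimates as the window shrinks) can be checked with a small Monte Carlo sketch. It assumes the same data-generating process as `df`; the function name `simulate_jump`, the 200 replications and the two bandwidths are arbitrary choices made for illustration.

```r
# Illustrative sketch: how the spread of the RDD estimate changes with bandwidth.
set.seed(42)

simulate_jump <- function(bw, n = 1000) {
  X_centered <- runif(n) - 0.5
  treated <- X_centered > 0
  Y <- X_centered + 0.7 * treated + 0.5 * X_centered * treated +
    rnorm(n, 0, 0.3)
  d <- data.frame(Y, X_centered, treated)
  # Keep only observations within the bandwidth, then fit the linear RDD
  fit <- lm(Y ~ treated * X_centered, data = d[abs(d$X_centered) < bw, ])
  unname(coef(fit)["treatedTRUE"])
}

wide   <- replicate(200, simulate_jump(0.25))
narrow <- replicate(200, simulate_jump(0.05))

# Mean and spread of the estimated jump across simulated samples
c(mean_wide = mean(wide), sd_wide = sd(wide),
  mean_narrow = mean(narrow), sd_narrow = sd(narrow))
```

Because the simulated model really is linear on each side, neither window is biased here and only the variance side of the trade-off shows up. With a misspecified functional form the wide window would also pick up bias.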
---

# Functional form

- Why fit a straight line on either side? If the true relationship is curvy this will give us the wrong result
- We can be much more flexible. As long as we fit some sort of line on either side, we can look for the jump
- One way to do this is with polynomials

$$Y = \beta_0 + \beta_1XCentered + \beta_2XCentered^2 + \beta_3Treated +$$
$$\beta_4Treated\times XCentered + \beta_5Treated\times XCentered^2 + \varepsilon$$

- `\(\beta_3\)` remains our "jump at the cutoff" - our RDD estimate

---

# Functional form

The true effect is `\(0.7\)`

```
## 
## Call:
## lm(formula = Y ~ X_centered * treated + I(X_centered^2) * treated, 
##     data = df)
## 
## Coefficients:
##                 (Intercept)                   X_centered  
##                    -0.03397                      0.69904  
##                 treatedTRUE              I(X_centered^2)  
##                     0.76774                     -0.57215  
##      X_centered:treatedTRUE  treatedTRUE:I(X_centered^2)  
##                     0.75094                      0.53190
```

---

# Functional form

- The interpretation is the same as before - look for the jump
- We want to be careful with polynomials though, and not add too many
- Remember, the more polynomial terms we add, the stranger the behavior of the line at *either end* of the range of data
- And the cutoff is at the far-right end of the pre-cutoff data and the far-left end of the post-cutoff data
- So we can get illusory effects generated by having too many terms
- A common approach is to use *non-parametric* regression or *local linear regression*
- This doesn't impose any particular shape.
And it's easy to get a prediction on either side of the cutoff
- This allows for non-straight lines without dealing with the issues polynomials bring us

---

# Functional form

- Looking purely just at the cutoff and making no use of the space *away* from the cutoff throws out a lot of useful information
- We know that the running variable is related to the outcome, so we can probably improve our *prediction* of what the value on either side of the cutoff should be if we *use data away from the cutoff to help with prediction* rather than *just using data near the cutoff*
- We can do this with OLS
- The bin plot we did can help us pick a functional form for the slope

---

# Functional form

- Let's look at the same data with a few different functional forms
- Remember, the RDD effect is the jump at the cutoff. The TRUE effect here will be `\(0.7\)`, and the TRUE model is an order-2 polynomial

```r
tb <- tibble(Running = runif(200)) %>%
  mutate(Y = 1.5*Running - .6*Running^2 + .7*(Running > .5) + rnorm(200, 0, .25)) %>%
  mutate(X_centered = Running - .5, Treated = Running > .5)
```

---

# Functional form

<!-- -->

---

# Functional form

- Avoid higher-order polynomials
- Even the "true model" can be worse than something simpler sometimes
- And fewer terms makes more sense too once we apply a bandwidth and zoom in
- Be very suspicious if your fit veers wildly off right around the cutoff
- Consider a nonparametric approach

---

# Controls

- Generally you don't need control variables in an RDD
- If the design is valid, you've closed all back doors. That's sort of the whole point
- Although maybe we want some if we have a wide bandwidth - this will remove some of the bias
- Still, we can get real value from having access to control variables. How?
- Control variables allow us to perform *placebo tests* of our RDD model
- We can rerun our RDD model, but simply use a control variable as the outcome
- We should not find any effect (outside of the levels expected by normal sampling variation)
- You can run these for *every control variable you have*

---

# Balance

- One thing that's so great about RDD is that, since it's basically random whether you're on one side of the cutoff or another, there shouldn't be other back doors
- It's a form of within variation that's *so narrow* it basically closes everything
- We can check this by seeing if other variables differ on either side of the line

---

# Assumptions

- We knew there must be some assumptions lurking around here
- Some are more obvious (we should be using the correct functional form)
- Others are trickier. What are we assuming about the error term and endogeneity here?
- Specifically, we are assuming that *the only thing jumping at the cutoff is treatment*
- Sort of like parallel trends, but maybe more believable since we've narrowed in so far
- For example, if having an income below 150% of the poverty line gets you access to food stamps AND to job training, then we can't really use that cutoff to get the effect of just food stamps
- The only thing different about just above/just below should be treatment
- What if the running variable is *manipulated*?

---

# Manipulated running variables

- Imagine you're a teacher grading the gifted-and-talented exam. You see someone with a 74 and think "aww, they're so close! I'll just give them an extra point..."
- Suddenly, that treatment is a lot less randomly assigned around the cutoff
- If there's manipulation of the running variable around the cutoff, we can often see it in the presence of *lumping*
- In other words, there's a big cluster of observations to one side of the cutoff and a seeming gap on the other side
- How can we check this?
- We can look graphically by just checking for a jump at the cutoff in *number of observations* after binning --- # Manipulated running variables - Here's an example from the real world in medical research - statistically, p-values *should* be uniformly distributed - But it's hard to get insignificant results published in some journals. So people might "p-hack" until they find some form of analysis that's significant, and also we have heavy selection into publication based on `\(p < .05\)`. Can't use that cutoff for an RDD  --- # Manipulated running variables - The first one looks pretty good. We have one that looks not-so-good on the right <!-- --> --- # Manipulated running variables - Another thing we can do is do a "placebo test" - Check if variables *other than treatment or outcome* vary at the cutoff - We can do this by re-running our RDD but just swapping out some other variable for our outcome - If we get a significant jump, that's bad. That tells us that *other things are changing at the cutoff* which implies some sort of manipulation (or just super lousy luck) --- # Fuzzy regression discontinuity - What if treatment is not determined sharply by the cutoff? - We can account for this with a model designed to take this into account - Specifically, we can use something called two-stage least squares (instrumental variables) to handle these sorts of situations - Basically, two-stage least squares estimates how much the chances of treatment go up at the cutoff, and scales the estimate by that change --- # Fuzzy regression discontinuity - Notice that the y-axis here isn't the outcome, it's "percentage treated" <!-- --> --- # Fuzzy regression discontinuity - We can perform this using `feols` from **fixest**, giving it two treatment-response functions - The first is an RDD specification where we use "treatment" - i.e. 
whether you were actually treated - The second uses the same RDD specification, but replaces "treatment" with "above the cutoff" --- # Fuzzy regression discontinuity <!-- --> --- # Fuzzy regression discontinuity - What happens if we just do RDD as normal? - The effect is understated because we have some untreated in the post-cutoff and treated in the pre-cutoff - So with a positive effect the pre-cutoff value goes up (because we mix some treatment effect in there) and the post-cutoff value goes down (since we mix some untreated in there), bringing them closer together and shrinking the effect estimate --- # Fuzzy regression discontinuity <!-- --> --- # Fuzzy regression discontinuity - The true effect is `\(2\)` ```r fuzz <- fuzz %>% mutate(Above = Running >= .5) mreg <- lm(Y ~ Running*Above, data = fuzz) msummary(list(Y = mreg), stars = TRUE, gof_omit = 'AIC|BIC|F|Lik|Adj') ``` <table style="NAborder-bottom: 0; width: auto !important; margin-left: auto; margin-right: auto;" class="table"> <thead> <tr> <th style="text-align:left;"> </th> <th style="text-align:center;"> Y </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> (Intercept) </td> <td style="text-align:center;"> 0.980*** </td> </tr> <tr> <td style="text-align:left;"> </td> <td style="text-align:center;"> (0.189) </td> </tr> <tr> <td style="text-align:left;"> Running </td> <td style="text-align:center;"> 2.574*** </td> </tr> <tr> <td style="text-align:left;"> </td> <td style="text-align:center;"> (0.638) </td> </tr> <tr> <td style="text-align:left;"> AboveTRUE </td> <td style="text-align:center;"> 1.113* </td> </tr> <tr> <td style="text-align:left;"> </td> <td style="text-align:center;"> (0.554) </td> </tr> <tr> <td style="text-align:left;"> Running × AboveTRUE </td> <td style="text-align:center;"> −0.677 </td> </tr> <tr> <td style="text-align:left;box-shadow: 0px 1px"> </td> <td style="text-align:center;box-shadow: 0px 1px"> (0.935) </td> </tr> <tr> <td style="text-align:left;"> Num.Obs. 
</td> <td style="text-align:center;"> 150 </td> </tr> <tr> <td style="text-align:left;"> R2 </td> <td style="text-align:center;"> 0.590 </td> </tr> </tbody> <tfoot><tr><td style="padding: 0; " colspan="100%"> <sup></sup> + p < 0.1, * p < 0.05, ** p < 0.01, *** p < 0.001</td></tr></tfoot> </table>

---

# Fuzzy regression discontinuity

- We can scale by how much the treatment prevalence jumped...
if the chance of being treated only went up by 50%, then the effect we see should be 50% as large, so let's adjust that away ```r mreg <- lm(Y ~ Running*Above, data = fuzz) mtr <- lm(Treat ~ Running*Above, data = fuzz) ``` --- # Fuzzy regression discontinuity - We can try literally dividing the effect on `\(Y\)` by the effect on `\(Treated\)`: 1.113 / 0.663 = 1.678 <table style="NAborder-bottom: 0; width: auto !important; margin-left: auto; margin-right: auto;" class="table"> <thead> <tr> <th style="text-align:left;"> </th> <th style="text-align:center;"> Y </th> <th style="text-align:center;"> Treated </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> (Intercept) </td> <td style="text-align:center;"> 0.980*** </td> <td style="text-align:center;"> 0.003 </td> </tr> <tr> <td style="text-align:left;"> </td> <td style="text-align:center;"> (0.189) </td> <td style="text-align:center;"> (0.069) </td> </tr> <tr> <td style="text-align:left;"> Running </td> <td style="text-align:center;"> 2.574*** </td> <td style="text-align:center;"> 0.647** </td> </tr> <tr> <td style="text-align:left;"> </td> <td style="text-align:center;"> (0.638) </td> <td style="text-align:center;"> (0.231) </td> </tr> <tr> <td style="text-align:left;"> AboveTRUE </td> <td style="text-align:center;"> 1.113* </td> <td style="text-align:center;"> 0.663** </td> </tr> <tr> <td style="text-align:left;"> </td> <td style="text-align:center;"> (0.554) </td> <td style="text-align:center;"> (0.201) </td> </tr> <tr> <td style="text-align:left;"> Running × AboveTRUE </td> <td style="text-align:center;"> −0.677 </td> <td style="text-align:center;"> −0.287 </td> </tr> <tr> <td style="text-align:left;box-shadow: 0px 1px"> </td> <td style="text-align:center;box-shadow: 0px 1px"> (0.935) </td> <td style="text-align:center;box-shadow: 0px 1px"> (0.339) </td> </tr> <tr> <td style="text-align:left;"> Num.Obs. 
</td> <td style="text-align:center;"> 150 </td> <td style="text-align:center;"> 150 </td> </tr> <tr> <td style="text-align:left;"> R2 </td> <td style="text-align:center;"> 0.590 </td> <td style="text-align:center;"> 0.627 </td> </tr> </tbody> <tfoot><tr><td style="padding: 0; " colspan="100%"> <sup></sup> + p < 0.1, * p < 0.05, ** p < 0.01, *** p < 0.001</td></tr></tfoot> </table>

---

# Fuzzy regression discontinuity

- Or we can use instrumental variables (IV) for this, with being above the cutoff as an instrument for treatment

```r
# ivreg() is available from the AER package (and from the ivreg package)
ivr <- ivreg(Y ~ Running*Treat | Running*Above, data = fuzz)
```

---

# Fuzzy regression discontinuity

<table style="NAborder-bottom: 0; width: auto !important; margin-left: auto; margin-right: auto;" class="table"> <thead> <tr> <th style="text-align:left;"> </th> <th style="text-align:center;"> Instrumental Variables </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> (Intercept) </td> <td style="text-align:center;"> 0.970*** </td> </tr> <tr> <td style="text-align:left;"> </td> <td style="text-align:center;"> (0.132) </td> </tr> <tr> <td style="text-align:left;"> Running </td> <td style="text-align:center;"> 1.599** </td> </tr> <tr> <td style="text-align:left;"> </td> <td style="text-align:center;"> (0.564) </td> </tr> <tr> <td style="text-align:left;"> Treat </td> <td style="text-align:center;"> 1.616*** </td> </tr> <tr> <td style="text-align:left;"> </td> <td style="text-align:center;"> (0.439) </td> </tr> <tr> <td style="text-align:left;"> Running × Treat </td> <td style="text-align:center;"> −0.235 </td> </tr> <tr> <td style="text-align:left;box-shadow: 0px 1px"> </td> <td style="text-align:center;box-shadow: 0px 1px"> (0.586) </td> </tr> <tr> <td style="text-align:left;"> Num.Obs. 
</td> <td style="text-align:center;"> 150 </td> </tr> <tr> <td style="text-align:left;"> R2 </td> <td style="text-align:center;"> 0.822 </td> </tr> </tbody> <tfoot><tr><td style="padding: 0; " colspan="100%"> <sup></sup> + p < 0.1, * p < 0.05, ** p < 0.01, *** p < 0.001</td></tr></tfoot> </table> --- # Regression discontinuity in action - There are additional estimation details that are difficult to do yourself - There are optimal bandwidth selection operators - There is bias introduced by taking points away from the cutoff, but also available corrections for that bias - We probably want to use a command that does this stuff for us --- # Regression discontinuity in action - The **rdrobust** package has the `rdrobust` function which runs regression discontinuity with: - Options for fuzzy RD - Optimal bandwidth selection - Bias correction - Lots of options (including the addition of covariates) --- # Regression discontinuity in action - The true effect is `\(0.7\)` ```r library(rdrobust) m <- rdrobust(tb$Y, tb$Running, c = .5) ``` --- # Regression discontinuity in action - The true effect is `\(0.7\)` ``` ## ## Call: ## lm(formula = Y ~ X_centered * Treated + I(X_centered^2) * Treated + ## I(X_centered^3) * Treated + I(X_centered^4) * Treated + I(X_centered^5) * ## Treated + I(X_centered^6) * Treated + I(X_centered^7) * Treated + ## I(X_centered^8) * Treated, data = tb) ## ## Residuals: ## Min 1Q Median 3Q Max ## -0.62888 -0.14620 0.02032 0.15457 0.69750 ## ## Coefficients: ## Estimate Std. 
Error t value Pr(>|t|) ## (Intercept) 2.402e-01 5.838e-01 0.411 0.6812 ## X_centered -8.273e+00 4.393e+01 -0.188 0.8508 ## TreatedTRUE 1.708e+00 6.904e-01 2.474 0.0143 * ## I(X_centered^2) 1.187e+02 1.106e+03 0.107 0.9147 ## I(X_centered^3) 4.101e+03 1.337e+04 0.307 0.7595 ## I(X_centered^4) 3.669e+04 8.878e+04 0.413 0.6799 ## I(X_centered^5) 1.562e+05 3.408e+05 0.458 0.6473 ## I(X_centered^6) 3.500e+05 7.533e+05 0.465 0.6428 ## I(X_centered^7) 3.976e+05 8.894e+05 0.447 0.6554 ## I(X_centered^8) 1.802e+05 4.340e+05 0.415 0.6785 ## X_centered:TreatedTRUE -6.780e+01 5.446e+01 -1.245 0.2147 ## TreatedTRUE:I(X_centered^2) 1.958e+03 1.427e+03 1.372 0.1717 ## TreatedTRUE:I(X_centered^3) -2.892e+04 1.766e+04 -1.638 0.1032 ## TreatedTRUE:I(X_centered^4) 1.213e+05 1.187e+05 1.022 0.3083 ## TreatedTRUE:I(X_centered^5) -7.299e+05 4.584e+05 -1.592 0.1131 ## TreatedTRUE:I(X_centered^6) 8.435e+05 1.016e+06 0.831 0.4073 ## TreatedTRUE:I(X_centered^7) -1.723e+06 1.199e+06 -1.437 0.1524 ## TreatedTRUE:I(X_centered^8) 4.288e+05 5.841e+05 0.734 0.4638 ## --- ## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 ## ## Residual standard error: 0.2486 on 182 degrees of freedom ## Multiple R-squared: 0.8697, Adjusted R-squared: 0.8575 ## F-statistic: 71.43 on 17 and 182 DF, p-value: < 2.2e-16 ``` --- # Regression discontinuity in action - Or, easily plot the results. Note the default uses order-4 polynomial unlike `rdrobust` which is local linear ```r rdplot(tb$Y, tb$Running, c = .5) ``` --- # Regression discontinuity in action - The true effect is `\(0.7\)` <!-- --> --- # References - Huntington-Klein, N. The Effect: An Introduction to Research Design and Causality, https://theeffectbook.net - Huntington-Klein, N. Causal Inference Class Slides, https://github.com/stnavdeev/CausalitySlides - Huntington-Klein, N. Econometrics Course Slides, https://github.com/stnavdeev/EconometricsSlides